Large-scale music similarity search with spatial trees

Authors

  • Brian McFee
  • Gert R. G. Lanckriet
Abstract

Many music information retrieval tasks require finding the nearest neighbors of a query item in a high-dimensional space. However, the complexity of computing nearest neighbors grows linearly with the size of the database, making exact retrieval impractical for large databases. We investigate modern variants of the classical KD-tree algorithm, which efficiently index high-dimensional data by recursive spatial partitioning. Experiments on the Million Song Dataset demonstrate that content-based similarity search can be significantly accelerated by the use of spatial partitioning structures.
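
As a rough illustration of the recursive spatial partitioning mentioned in the abstract, the following is a minimal sketch of a classical KD-tree with exact nearest-neighbor search. It is not the authors' implementation and does not reproduce the modern variants the paper investigates; the function names (`build_kd_tree`, `nearest`, `sq_dist`), the dictionary-based node layout, and the random 3-dimensional points are all assumptions made for this example.

```python
import random


def build_kd_tree(points, depth=0):
    """Recursively split the points at the median along one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])  # cycle through the coordinate axes
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kd_tree(points[:mid], depth + 1),
        "right": build_kd_tree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return the stored point closest to query (exact search with pruning)."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], query) < sq_dist(best, query):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    if diff < 0:
        near, far = node["left"], node["right"]
    else:
        near, far = node["right"], node["left"]
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the
    # best match found so far; otherwise that whole subtree is skipped.
    if diff ** 2 < sq_dist(best, query):
        best = nearest(far, query, best)
    return best


# Hypothetical usage: random 3-D points stand in for song feature vectors.
rng = random.Random(0)
pts = [tuple(rng.random() for _ in range(3)) for _ in range(200)]
tree = build_kd_tree(pts)
match = nearest(tree, (0.5, 0.5, 0.5))
```

The pruning step is what lets a query skip whole subtrees instead of scanning every item as a linear search would. In high-dimensional spaces this pruning tends to become less effective, which is one common motivation for approximate variants of the structure.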

Similar resources

Learning Binary Codes For Efficient Large-Scale Music Similarity Search

Content-based music similarity estimation provides a way to find songs in the unpopular “long tail” of commercial catalogs. However, state-of-the-art music similarity measures are too slow to apply to large databases, as they are based on finding nearest neighbors among very high-dimensional or non-vector song representations that are difficult to index. In this work, we adopt recent machine le...

A Tale of Two (Similar) Cities: Inferring City Similarity Through Geo-Spatial Query Log Analysis Submitted for Blind Review

Understanding the backgrounds and interests of the people who are consuming a piece of content, such as a news story, video, or music, is vital for the content producer as well as the advertisers who rely on the content to provide a channel on which to advertise. We extend traditional search-engine query log analysis, which has primarily concentrated on analyzing either single or small groups of qu...

A partition-based algorithm for clustering large-scale software systems

Clustering techniques are used to extract the structure of software for understanding, maintaining, and refactoring. In the literature, most of the proposed approaches for software clustering are divided into hierarchical algorithms and search-based techniques. In the former, clustering is a process of merging (splitting) similar (non-similar) clusters. These techniques suffered from the drawba...

Eyes4Ears - More than a Classical Music Retrieval System

Content-based similarity search for music retrieval has attracted a lot of attention in recent information retrieval research. Most music applications (e.g. several commercial web portals) offer to search music files, which however is limited to keyword-based search on subjects like genre or artist. Other similarity search approaches are based on abstract metrics, which are defined on feature vectors repr...

Efficient multifeature index structures for music data retrieval

In this paper, we propose four index structures for music data retrieval. Based on suffix trees, we develop two index structures called Combined Suffix Tree and Independent Suffix Trees. These methods still show shortcomings for some search functions. Hence we develop another index, called Twin Suffix Trees, to overcome these problems. However, the Twin Suffix Trees lack scalability when the...


Publication date: 2011